Probabilistic Auto-Associative Models and Semi-Linear PCA
Auto-Associative models cover a large class of methods used in data analysis.
In this paper, we describe the general properties of these models when the
projection component is linear, and we propose and test an easy-to-implement
Probabilistic Semi-Linear Auto-Associative model in a Gaussian setting. We
show it is a generalization of the PCA model to the semi-linear case. Numerical
experiments on simulated datasets and a real astronomical application highlight
the interest of this approach.
Auto-Associative models and generalized Principal Component Analysis
In this communication, we propose auto-associative (AA) models to generalize principal component analysis (PCA). AA models were introduced in data analysis from a geometrical point of view. They are based on approximating the scatter plot of the observations by a differentiable manifold, and they can be interpreted as projection pursuit models adapted to the auto-associative case. Their theoretical properties are established and are shown to extend those of PCA. An iterative construction algorithm is proposed, and its principle is illustrated on both simulated data and real data from image analysis.
Block clustering of Binary Data with Gaussian Co-variables
The simultaneous grouping of rows and columns is an important technique that is increasingly used in large-scale data analysis. In this paper, we present a novel co-clustering method that uses co-variables in its construction. It is based on a latent block model that addresses the joint problem of grouping variables and clustering individuals by integrating the information given by sets of co-variables. Numerical experiments on simulated data sets and an application to real genetic data highlight the interest of this approach.
Estimation of Parsimonious Covariance Models for Gaussian Matrix Valued Random Variables for Multi-Dimensional Spectroscopic Data
Satellite remote sensing makes it possible to observe landscapes on large spatial scales. The Sentinel-1 and Sentinel-2 satellites currently provide full coverage of the national territory of France every 5 days. Due to the orbit of the satellites, coupled with the presence of clouds, the sampling of the pixels is temporally irregular. The project aims to develop, study and implement supervised and unsupervised classification methods for data that are of different natures (heterogeneous) and contain missing and/or aberrant values. The methods are developed to process satellite and aerial data for ecology and cartography.
Rmixmod: The R Package of the Model-Based Unsupervised, Supervised and Semi-Supervised Classification Mixmod Library
Mixmod is a well-established software package for fitting a mixture model of multivariate Gaussian or multinomial probability distribution functions to a given data set, for the purpose of clustering, density estimation or discriminant analysis. The Rmixmod S4 package provides a bridge between the C++ core library of Mixmod (mixmodLib) and the R statistical computing environment. In this article, we give an overview of the model-based clustering and classification methods, and we show how the R package Rmixmod can be used for clustering and discriminant analysis.